ICRCS at Intent2: Applying Rough Set and Semantic Relevance for Subtopic Mining
Abstract
The goal of the subtopic mining subtask of the NTCIR-10 Intent-2 Task is to return a ranked list of subtopics. To this end, this paper proposes a method that applies rough set theory to reduce redundancy among subtopics mined from webpages. In addition, semantic similarity, computed from semantic features extracted with NLP tools and a semantic dictionary, is used to measure subtopic relevance in the re-ranking process. Using the reduction concept of rough sets, we first construct a rough-set-based model (RSBM) for subtopic mining. Next, we combine rough set theory and semantic relevance into a new model (RS&SRM). Evaluation results show the effectiveness of our approach compared with a baseline frequency-term-based model (FTBM). The best performance is achieved by RS&SRM, with an I-rec of 0.4046, a D-nDCG of 0.4413, and a D#-nDCG of 0.4229 on the Chinese subtopic mining subtask.
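The three reported scores are related: in the NTCIR INTENT evaluations, D#-nDCG is a linear combination of I-rec (intent recall) and D-nDCG. The minimal sketch below checks the reported figures against that relation. The weight `gamma = 0.5` is an assumption (equal weighting, the usual setting in that evaluation); the abstract itself does not state it, and the function name `d_sharp_ndcg` is hypothetical.

```python
def d_sharp_ndcg(i_rec: float, d_ndcg: float, gamma: float = 0.5) -> float:
    """Combine intent recall and D-nDCG into a single D#-nDCG score.

    gamma = 0.5 is an assumption (equal weighting); the abstract does not state it.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    return gamma * i_rec + (1.0 - gamma) * d_ndcg


# Scores reported for RS&SRM in the abstract.
combined = d_sharp_ndcg(0.4046, 0.4413)
print(f"{combined:.5f}")
```

Under that assumption the combination comes to about 0.42295, which agrees with the reported D#-nDCG of 0.4229 up to rounding of the two inputs.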
Similar resources
Query Subtopic Mining Exploiting Word Embedding for Search Result Diversification
Understanding users’ search intents by mining query subtopics is a challenging task and a prerequisite step for search result diversification. This paper proposes mining query subtopics by exploiting word embeddings and a short-text similarity measure. We extract candidate subtopics from multiple sources and introduce a new way of ranking based on a new novelty estimation that faithfully repres...
LIA at the NTCIR-10 INTENT Task
This paper describes the participation of the LIA team in the English Subtopic Mining subtask of the NTCIR-10 INTENT2 Task. The goal of this task was to specialize or disambiguate web search queries by identifying the different subtopics that could refer to these queries. Our motivation was to take a conceptual approach, therefore representing the query by a set of concepts before identifying t...
Application of Rough Set Theory in Data Mining for Decision Support Systems (DSSs)
Decision support systems (DSSs) are prevalent information systems for decision making in many competitive business environments. In a DSS, the decision-making process is closely tied to factors that determine the quality of information systems and their related products. Traditional approaches to data analysis usually cannot be implemented in sophisticated companies, where managers ne...
Query Subtopic Mining by Combining Multiple Semantics
Query subtopic mining aims to find aspects to represent people’s potential intents for a query. Clustering query reformulations is the most common approach for subtopic mining these days. However, there are some challenges that the existing approaches have to face in finding both relevant and diverse subtopics, such as term mismatch and data sparseness. In this paper, a novel semantic represent...
TUTA1 at the NTCIR-10 Intent Task
In NTCIR-10, we participated in the Subtopic Mining subtask. We classify test topics into two types: role-explicit topics and role-implicit topics. Depending on the topic type, we devise different approaches to subtopic mining. Specifically, for role-explicit topics, we propose an approach of modifier-graph-based subtopic mining. The key idea is that the modifier graph corresponding to ...